shadow removal
MatteViT: High-Frequency-Aware Document Shadow Removal with Shadow Matte Guidance
Kim, Chaewon, Lee, Seoyeon, Park, Jonghyuk
Document shadow removal is essential for enhancing the clarity of digitized documents. Preserving high-frequency details (e.g., text edges and lines) is critical in this process because shadows often obscure or distort fine structures. This paper proposes a matte vision transformer (MatteViT), a novel shadow removal framework that applies spatial and frequency-domain information to eliminate shadows while preserving fine-grained structural details. To effectively retain these details, we employ two preservation strategies. First, our method introduces a lightweight high-frequency amplification module (HFAM) that decomposes and adaptively amplifies high-frequency components. Second, we present a continuous luminance-based shadow matte, generated using a custom-built matte dataset and shadow matte generator, which provides precise spatial guidance from the earliest processing stage. These strategies enable the model to accurately identify fine-grained regions and restore them with high fidelity. Extensive experiments on public benchmarks (RDD and Kligler) demonstrate that MatteViT achieves state-of-the-art performance, providing a robust and practical solution for real-world document shadow removal. Furthermore, the proposed method better preserves text-level details in downstream tasks, such as optical character recognition, improving recognition performance over prior methods.
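The idea of decomposing an image into low- and high-frequency parts and boosting the latter can be illustrated with a minimal sketch. This is not the authors' HFAM, which the abstract describes as adaptive: the box-blur low-pass filter, the fixed `gain`, and the function names below are assumptions chosen only for illustration.

```python
import numpy as np

def box_blur(img: np.ndarray, radius: int = 2) -> np.ndarray:
    """Low-pass filter: mean over a (2r+1)x(2r+1) window with edge padding."""
    k = 2 * radius + 1
    padded = np.pad(img, radius, mode="edge")
    out = np.zeros_like(img, dtype=np.float64)
    # Sum the k*k shifted copies of the image, then normalize.
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (k * k)

def amplify_high_freq(img: np.ndarray, gain: float = 1.5, radius: int = 2) -> np.ndarray:
    """Split a 2-D grayscale image into low + high bands and scale the high band.

    `gain` is a fixed scalar here (an assumption); a learned module would
    predict it from the input instead.
    """
    img = img.astype(np.float64)
    low = box_blur(img, radius)   # smooth content, e.g. paper background
    high = img - low              # residual detail, e.g. text edges
    return low + gain * high
```

With `gain=1.0` the function returns the input unchanged; values above 1 increase the contrast across edges such as text strokes, while flat regions are unaffected.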
- Information Technology > Sensing and Signal Processing > Image Processing (1.00)
- Information Technology > Artificial Intelligence > Vision > Optical Character Recognition (0.69)
- Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition (0.68)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)
- Asia > China > Tianjin Province > Tianjin (0.04)
- Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
- Information Technology > Sensing and Signal Processing > Image Processing (1.00)
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (0.68)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
- Information Technology > Artificial Intelligence > Vision (0.94)
- Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
- Information Technology > Sensing and Signal Processing > Image Processing (0.69)
Elucidating the Design Space of Arbitrary-Noise-Based Diffusion Models
Qiu, Xingyu, Yang, Mengying, Ma, Xinghua, Liang, Dong, Li, Yuzhen, Li, Fanding, Luo, Gongning, Wang, Wei, Wang, Kuanquan, Li, Shuo
EDM elucidates the unified design space of diffusion models, yet its fixed noise pattern, restricted to pure Gaussian noise, limits advancements in image restoration. Our study indicates that forcibly injecting Gaussian noise corrupts the degraded images, overextends the image transformation distance, and increases restoration complexity. To address this problem, we propose EDA, which Elucidates the Design space of Arbitrary-noise-based diffusion models. Theoretically, EDA expands the freedom of the noise pattern while preserving the original module flexibility of EDM, with rigorous proof that increased noise complexity incurs no additional computational overhead during restoration. EDA is validated on three typical tasks: MRI bias field correction (global smooth noise), CT metal artifact reduction (global sharp noise), and natural image shadow removal (local boundary-aware noise). With only 5 sampling steps, EDA outperforms most task-specific methods and achieves state-of-the-art performance in bias field correction and shadow removal.
- Asia > China > Heilongjiang Province > Harbin (0.04)
- South America > Peru > Lima Department > Lima Province > Lima (0.04)
- North America > United States (0.04)
- Asia > China > Guangdong Province > Shenzhen (0.04)
DocShaDiffusion: Diffusion Model in Latent Space for Document Image Shadow Removal
Liu, Wenjie, Wang, Bingshu, Wang, Ze, Chen, C. L. Philip
Document shadow removal is a crucial task in the field of document image enhancement. However, existing methods tend to remove shadows against a constant-color background and ignore color shadows. In this paper, we first design a diffusion model in latent space for document image shadow removal, called DocShaDiffusion. It translates shadow images from pixel space to latent space, enabling the model to more easily capture essential features. To address the issue of color shadows, we design a shadow soft-mask generation module (SSGM). It produces accurate shadow masks and adds noise specifically to shadow regions. Guided by the shadow mask, a shadow mask-aware guided diffusion module (SMGDM) is proposed to remove shadows from document images by supervising the diffusion and denoising process. We also propose a shadow-robust perceptual feature loss to preserve details and structures in document images. Moreover, we develop a large-scale synthetic document color shadow removal dataset (SDCSRD). It simulates the distribution of realistic color shadows and provides strong support for model training. Experiments on three public datasets validate the proposed method's superiority over state-of-the-art methods. Our code and dataset will be publicly available.
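Adding noise only to shadow regions under a soft mask can be sketched as follows. This is a pixel-space illustration under stated assumptions, not the SSGM itself: the abstract says the method works in latent space, and the function name, the `sigma` noise scale, and the multiplicative use of the mask are all hypothetical.

```python
import numpy as np

def add_masked_noise(image: np.ndarray, soft_mask: np.ndarray,
                     sigma: float = 0.1, seed: int = 0) -> np.ndarray:
    """Perturb `image` with Gaussian noise weighted by a soft mask in [0, 1].

    Pixels with mask 0 are left untouched; pixels with mask 1 receive
    noise of standard deviation `sigma` (an assumed, fixed scale).
    """
    rng = np.random.default_rng(seed)
    eps = rng.standard_normal(image.shape)
    return image + sigma * soft_mask * eps
```

Because the mask is continuous rather than binary, pixels near the shadow boundary receive proportionally less noise than pixels deep inside the shadow.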
- Asia > China > Shaanxi Province > Xi'an (0.05)
- Asia > Macao (0.04)
- Asia > China > Guangdong Province > Guangzhou (0.04)
Latent Feature-Guided Diffusion Models for Shadow Removal
Mei, Kangfu, Figueroa, Luis, Lin, Zhe, Ding, Zhihong, Cohen, Scott, Patel, Vishal M.
Recovering textures under shadows has remained a challenging problem due to the difficulty of inferring shadow-free scenes from shadow images. In this paper, we propose the use of diffusion models as they offer a promising approach to gradually refine the details of shadow regions during the diffusion process. Our method improves this process by conditioning on a learned latent feature space that inherits the characteristics of shadow-free images, thus avoiding the limitation of conventional methods that condition on degraded images only. Additionally, we propose to alleviate potential local optima during training by fusing noise features with the diffusion network. We demonstrate the effectiveness of our approach which outperforms ...
Motivated by the success of diffusion-based image restoration models [38, 41], we adapt diffusion models for the task of shadow removal by conditioning on the input shadow image and corresponding shadow mask as a baseline approach to generate shadow-free images. However, preserving and generating high-fidelity textures and colors in the shadow region after removal is non-trivial. The baseline model appears to favor borrowing textures from the surrounding non-shadow areas rather than focusing on restoring the original details underneath the shadow, which results in incorrect color mixtures and loss of detail in the shadow region. In Figure 1, we show one of the representative issues of image-mask conditioning, i.e., the model synthesizes results containing an incorrect color mixture.
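One common way to condition a denoiser on a shadow image and its mask is channel-wise concatenation with the noisy sample. The sketch below assumes that scheme and a channels-first layout; the text does not say how the baseline combines its inputs, so the function name and layout are hypothetical.

```python
import numpy as np

def build_denoiser_input(noisy: np.ndarray, shadow_image: np.ndarray,
                         shadow_mask: np.ndarray) -> np.ndarray:
    """Stack the noisy sample, the shadow image, and the mask along channels.

    Shapes (channels-first, assumed): noisy and shadow_image are (C, H, W);
    shadow_mask is (H, W) or (1, H, W). The result is (2C + 1, H, W).
    """
    if shadow_mask.ndim == 2:
        shadow_mask = shadow_mask[None]  # add a channel axis
    return np.concatenate([noisy, shadow_image, shadow_mask], axis=0)
```

For RGB inputs this yields a 7-channel tensor, which a denoising network would take as input at every sampling step.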